Optimal Strategies in Infinite-state Stochastic Reachability Games

Author: A. Condon
D. A. Martin
Giovanna D'Agostino
H. Gimbert
J. Esparza
K. Etessami
M. L. Puterman
N. Berger
Salvatore La Torre
T. Brázdil
Václav Brožek
Publication venue: 'Open Publishing Association'
Publication date: 01/06/2011

We consider perfect-information reachability stochastic games for 2 players on infinite graphs. We identify a subclass of such games, and prove two interesting properties of it: first, Player Max always has optimal strategies in games from this subclass, and second, these games are strongly determined. The subclass is defined by the property that the set of all values can have only one accumulation point, namely 0. Our results nicely mirror recent results for finitely-branching games, where, on the contrary, Player Min always has optimal strategies. However, our proof methods are substantially different, because the roles of the players are not symmetric. We also do not restrict the branching of the games. Finally, we apply our results in the context of recently studied One-Counter stochastic games.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Minimizing Running Costs in Consumption Systems

Author: A. Kučera
K. Chatterjee
P. Bouyer
T. Brázdil
U. Fahrenberg
Publication date: 01/01/2014

A standard approach to optimizing long-run running costs of discrete systems is based on minimizing the mean-payoff, i.e., the long-run average amount of resources ("energy") consumed per transition. However, this approach inherently assumes that the energy source has an unbounded capacity, which is not always realistic. For example, an autonomous robotic device has a battery of finite capacity that has to be recharged periodically, and the total amount of energy consumed between two successive charging cycles is bounded by the capacity. Hence, a controller minimizing the mean-payoff must obey this restriction. In this paper we study the controller synthesis problem for consumption systems with a finite battery capacity, where the task of the controller is to minimize the mean-payoff while preserving the functionality of the system encoded by a given linear-time property. We show that an optimal controller always exists, and it may either need only finite memory or require infinite memory (it is decidable in polynomial time which of the two cases holds). Further, we show how to compute an effective description of an optimal controller in polynomial time. Finally, we consider the limit values achievable by larger and larger battery capacity, show that these values are computable in polynomial time, and we also analyze the corresponding rate of convergence. To the best of our knowledge, these are the first results about optimizing the long-run running costs in systems with bounded energy stores. Comment: 32 pages, corrections of typos and minor omissions.
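To make the two notions in this abstract concrete (mean-payoff as average energy per transition, and the battery bound between recharges), here is a minimal Python sketch. It is not the paper's synthesis algorithm: it only evaluates one fixed cyclic behaviour. The function name `mean_payoff_of_cycle`, the `reloads` flags, and the convention that a transition's cost is charged before any recharge in the state it reaches are all assumptions made for illustration.

```python
from typing import List


def mean_payoff_of_cycle(costs: List[int], reloads: List[bool], capacity: int) -> float:
    """Mean-payoff of repeating a cycle forever, checked against a battery bound.

    costs[i]   -- energy consumed by the i-th transition of the cycle (assumed >= 0)
    reloads[i] -- True if the state reached by transition i recharges the battery
    capacity   -- battery capacity (hypothetical parameter name)
    """
    if not costs or len(costs) != len(reloads):
        raise ValueError("costs and reloads must be non-empty and of equal length")
    if not any(reloads):
        raise ValueError("a cycle with no reload point cannot run forever on a finite battery")

    n = len(costs)
    # Start the check right after a reload point, where the battery is full.
    start = reloads.index(True) + 1
    used = 0
    for k in range(n):
        i = (start + k) % n
        used += costs[i]
        if used > capacity:
            raise ValueError("energy consumed between two reloads exceeds the capacity")
        if reloads[i]:
            used = 0  # battery is recharged to full

    # Long-run average energy consumed per transition.
    return sum(costs) / n


if __name__ == "__main__":
    print(mean_payoff_of_cycle([2, 3, 1, 4], [False, True, False, True], capacity=5))
```

With capacity 5 the example cycle is admissible; lowering the capacity to 4 makes the same cycle infeasible, which is the restriction the abstract says a controller must obey.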

arXiv.org e-Print Archive

Crossref

Tableaux for Policy Synthesis for MDPs with PCTL* Constraints

Author: A Kučera
C Baier
C Courcoubetis
E Altman
J Kemeny
M Kwiatkowska
T Brázdil
V Forejt
XC Ding
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/10/2017

Markov decision processes (MDPs) are the standard formalism for modelling sequential decision making in stochastic environments. Policy synthesis addresses the problem of how to control or limit the decisions an agent makes so that a given specification is met. In this paper we consider PCTL*, the probabilistic counterpart of CTL*, as the specification language. Because in general the policy synthesis problem for PCTL* is undecidable, we restrict to policies whose execution history memory is finitely bounded a priori. Surprisingly, no algorithm for policy synthesis for this natural and expressive framework has been developed so far. We close this gap and describe a tableau-based algorithm that, given an MDP and a PCTL* specification, derives in a non-deterministic way a system of (possibly nonlinear) equalities and inequalities. The solutions of this system, if any, describe the desired (stochastic) policies. Our main result in this paper is the correctness of our method, i.e., soundness, completeness and termination. Comment: This is a long version of a conference paper published at TABLEAUX 2017. It contains proofs of the main results and fixes a bug. See the footnote on page 1 for details.

arXiv.org e-Print Archive

Crossref

Analyzing probabilistic pushdown automata

Author: A Bouajjani
A Tarski
Antonín Kučera
C Baier
D Grigoriev
D Williams
E Allender
E Emerson
H Hansson
I Walukiewicz
J Canny
J Esparza
J Hopcroft
J Rosenthal
Javier Esparza
K Athreya
K Chung
K Etessami
L Bozzelli
R Alur
R Mayr
S Kiefer
Stefan Kiefer
T Brázdil
T Harris
Tomáš Brázdil
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

The paper gives a summary of the existing results about algorithmic analysis of probabilistic pushdown automata and their subclasses.

CiteSeerX

Crossref

Univerzitní repozitář Masarykovy univerzity

Value Iteration for Long-run Average Reward in Markov Decision Processes

Author: A Komuravelli
A McIver
AF Veinott
AK McIver
C Baier
C Courcoubetis
J Filar
K Chatterjee
M Duflot
M Kwiatkowska
ML Puterman
O Michael
RA Howard
S Giro
S Haddad
T Brázdil
Publication date: 31/08/2017

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long term performance. Value iteration (VI) is one of the simplest and most efficient algorithmic approaches to MDPs with other properties, such as reachability objectives. Unfortunately, a naive extension of VI does not work for MDPs with long-run average rewards, as there is no known stopping criterion. In this work our contributions are threefold. (1) We refute a conjecture related to stopping criteria for MDPs with long-run average rewards. (2) We present two practical algorithms for MDPs with long-run average rewards based on VI. First, we show that a combination of applying VI locally for each maximal end-component (MEC) and VI for reachability objectives can provide approximation guarantees. Second, extending the above approach with a simulation-guided on-demand variant of VI, we present an anytime algorithm that is able to deal with very large models. (3) Finally, we present experimental results showing that our methods significantly outperform the standard approaches on several benchmarks.
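For readers unfamiliar with the building block this abstract refers to, the following is a minimal Python sketch of plain value iteration for a maximal reachability objective. It is not the paper's long-run average algorithm and includes no MEC decomposition. The MDP encoding (a dict from state to a list of actions, each a list of `(probability, successor)` pairs) and the name `reachability_vi` are assumptions for illustration. The sketch runs a fixed iteration budget instead of using a stopping criterion, since the abstract notes that stopping criteria are the delicate part.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding: state -> list of actions; an action is a list of
# (probability, successor) pairs whose probabilities sum to 1.
MDP = Dict[str, List[List[Tuple[float, str]]]]


def reachability_vi(mdp: MDP, targets: Set[str], iterations: int) -> Dict[str, float]:
    """Approximate (from below) the maximal probability of reaching `targets`."""
    # Start from the indicator of the target set and iterate the Bellman operator.
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(iterations):
        new_values = {}
        for state, actions in mdp.items():
            if state in targets:
                new_values[state] = 1.0
            elif not actions:
                new_values[state] = 0.0  # dead end outside the target set
            else:
                # Best action: maximise the expected value of the successor.
                new_values[state] = max(
                    sum(p * values[succ] for p, succ in action) for action in actions
                )
        values = new_values
    return values


if __name__ == "__main__":
    example: MDP = {
        "s": [[(0.5, "goal"), (0.5, "s")], [(0.3, "goal"), (0.7, "sink")]],
        "goal": [],
        "sink": [[(1.0, "sink")]],
    }
    print(reachability_vi(example, {"goal"}, iterations=50))
```

In the example, retrying the first action from `s` reaches `goal` with probability approaching 1, so the iterates for `s` climb towards 1 while `sink` stays at 0.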

arXiv.org e-Print Archive

Crossref

Optimizing Performance of Continuous-Time Stochastic Systems using Timeout Synthesis

Author: C Baier
C Haase
C Lindemann
CC Guet
EM Hahn
H Choi
JR Norris
K Etessami
K Ramamritham
L Carnevali
M Kwiatkowska
MA Marsan
MF Neuts
ML Puterman
NC Audsley
P Buchholz
R Alur
R Obermaisser
SK Jha
T Brázdil
W Feller
Ľ Korenčiak
Publication date: 15/04/2016

We consider a parametric version of fixed-delay continuous-time Markov chains (or equivalently deterministic and stochastic Petri nets, DSPN) where fixed-delay transitions are specified by parameters, rather than concrete values. Our goal is to synthesize values of these parameters that, for a given cost function, minimise expected total cost incurred before reaching a given set of target states. We show that under mild assumptions, optimal values of parameters can be effectively approximated using translation to a Markov decision process (MDP) whose actions correspond to discretized values of these parameters.

arXiv.org e-Print Archive

Crossref

Approximating the Termination Value of One-Counter MDPs and Stochastic Games

Author: G.R. Grimmett
J. Lambert
K. Etessami
L.B. White
M.L. Puterman
T. Brázdil
Publication date: 01/01/2011

One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value. Recently, we studied qualitative decision problems ("is the optimal termination value = 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in P-time (in NP and coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value ≥ p", or "approximate the termination value within epsilon") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given epsilon > 0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error epsilon, and furthermore we can compute epsilon-optimal strategies for both players in the game. A key ingredient in our proofs is a subtle martingale, derived from solving certain LPs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes epsilon-optimal for OC-MDPs. Comment: 35 pages, 1 figure, full version of a paper presented at ICALP 2011, invited for submission to Information and Computation.
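As a small illustration of what a "termination value" is, consider the degenerate case with a single control state and no player choices: the counter goes up by 1 with probability `p` and down by 1 with probability `1 - p`. This is a plain random walk, not the OC-MDP or OC-SSG setting of the paper, and the sketch below is not the paper's martingale-based method. The function name and parameters are assumptions. The probability `x` of ever hitting 0 from counter value 1 is the least solution of `x = (1 - p) + p * x**2`, which can be approximated from below by fixed-point iteration starting at 0.

```python
def termination_probability(p_up: float, iterations: int) -> float:
    """Approximate the probability that a +1/-1 random walk started at
    counter value 1 ever reaches counter value 0.

    p_up       -- probability of incrementing the counter (hypothetical name)
    iterations -- number of fixed-point iteration steps
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be a probability")
    x = 0.0
    for _ in range(iterations):
        # Either decrement immediately (prob. 1 - p_up), or increment to 2
        # and then descend one level twice (prob. p_up * x * x).
        x = (1.0 - p_up) + p_up * x * x
    return x


if __name__ == "__main__":
    # For p_up > 1/2 the closed form is (1 - p_up) / p_up.
    print(termination_probability(0.6, 200))
```

Starting from counter value n, the corresponding probability in this toy model is the n-th power of the value above, because the walk must descend one level n times in a row.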

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Zero-Reachability in Probabilistic Multi-Counter Automata

Author: Abdulla P. A.
Baier C.
Bozzelli L.
Brázdil T.
Iyer S.
Kemeny J.
Minsky M.
Publication date: 01/01/2014

We study the qualitative and quantitative zero-reachability problem in probabilistic multi-counter systems. We identify the undecidable variants of the problems, and then we concentrate on the remaining two cases. In the first case, when we are interested in the probability of all runs that visit zero in some counter, we show that the qualitative zero-reachability is decidable in time which is polynomial in the size of a given pMC and doubly exponential in the number of counters. Further, we show that the probability of all zero-reaching runs can be effectively approximated up to an arbitrarily small given error epsilon > 0 in time which is polynomial in log(epsilon), exponential in the size of a given pMC, and doubly exponential in the number of counters. In the second case, we are interested in the probability of all runs that visit zero in some counter different from the last counter. Here we show that the qualitative zero-reachability is decidable and SquareRootSum-hard, and the probability of all zero-reaching runs can be effectively approximated up to an arbitrarily small given error epsilon > 0 (these results apply to pMCs satisfying a suitable technical condition that can be verified in polynomial time). The proof techniques invented in the second case allow us to construct counterexamples for some classical results about ergodicity in stochastic Petri nets. Comment: 20 pages.

arXiv.org e-Print Archive

Crossref

Publikationsserver der RWTH Aachen University

Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms

Author: A Jovanovic
A Jovanović
C Haase
C Lindemann
DLP Minh
DP Bertsekas
EG Amparore
EM Hahn
H Choi
JR Norris
L Alfaro
L-M Traonouez
M Češka
ML Puterman
PJ Haas
R German
SK Jha
T Brázdil
W Nelson
Publication date: 20/06/2017

Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events that can be non-exponentially distributed. Within parametric ACTMCs, the parameters of alarm-event distributions are not given explicitly and can be subject to parameter synthesis. An algorithm solving the $\varepsilon$-optimal parameter synthesis problem for parametric ACTMCs with long-run average optimization objectives is presented. Our approach is based on reduction of the problem to finding long-run average optimal strategies in semi-Markov decision processes (semi-MDPs) and sufficient discretization of parameter (i.e., action) space. Since the set of actions in the discretized semi-MDP can be very large, a straightforward approach based on explicit action-space construction fails to solve even simple instances of the problem. The presented algorithm uses an enhanced policy iteration on symbolic representations of the action space. The soundness of the algorithm is established for parametric ACTMCs with alarm-event distributions satisfying four mild assumptions that are shown to hold for uniform, Dirac and Weibull distributions in particular, but are satisfied for many other distributions as well. An experimental implementation shows that the symbolic technique substantially improves the efficiency of the synthesis algorithm and makes it possible to solve instances of realistic size. Comment: This article is a full version of a paper accepted to the Conference on Quantitative Evaluation of SysTems (QEST) 201

arXiv.org e-Print Archive

Crossref

Weak MSO+U with Path Quantifiers over Infinite Trees

Author: M. Bojańczyk
M. Vanden Boom
M.Y. Vardi
O. Kupferman
S. Hummel
S. Toruńczyk
T. Brázdil
T. Colcombet
Publication date: 01/01/2014

This paper shows that over infinite trees, satisfiability is decidable for weak monadic second-order logic extended by the unbounding quantifier U and quantification over infinite paths. The proof is by reduction to emptiness for a certain automaton model, while emptiness for the automaton model is decided using profinite trees. Comment: version of an ICALP 2014 paper with appendices.

arXiv.org e-Print Archive

Crossref